134 research outputs found

    Bioinformatic analyses of mammalian 5'-UTR sequence properties of mRNAs predicts alternative translation initiation sites

    Background: Utilization of alternative initiation sites for protein translation directed by non-AUG codons in mammalian mRNAs is observed with increasing frequency. Alternative initiation sites are utilized for the synthesis of important regulatory proteins that control distinct biological functions. It is, therefore, of high significance to define the parameters that allow accurate bioinformatic prediction of alternative translation initiation sites (aTIS). This study has investigated 5'-UTR regions of mRNAs to define consensus sequence properties and structural features that allow identification of alternative initiation sites for protein translation.
    Results: Bioinformatic evaluation of 5'-UTR sequences of mammalian mRNAs was conducted for classification and identification of alternative translation initiation sites for a group of mRNA sequences that have been experimentally demonstrated to utilize alternative non-AUG initiation sites for protein translation. These are represented by the codons CUG, GUG, UUG, AUA, and ACG for aTIS. The first phase of this bioinformatic analysis implemented a classification tree that evaluated 5'-UTRs for unique consensus sequence features near the initiation codon, characteristics of 5'-UTR nucleotide sequences, and secondary structural features in a decision tree that categorizes mRNAs into those with potential aTIS and those without. The second phase addressed identification of the aTIS codon and its location. Critical parameters of 5'-UTRs were assessed by an Artificial Neural Network (ANN) for identification of the aTIS codon and its location. ANNs have previously been used for the purpose of AUG start site prediction and are applicable in complex. ANN analyses demonstrated that multiple properties were required for predicting aTIS codons; these properties included unique consensus nucleotide sequences at positions -7 and -6 combined with positions -3 and +4, 5'-UTR length, ORF length, predicted secondary structures, free energy features, upstream AUGs, and G/C ratio. Importantly, combined results of the classification tree and the ANN analyses provided highly accurate bioinformatic predictions of alternative translation initiation sites.
    Conclusion: This study has defined the unique properties of 5'-UTR sequences of mRNAs for successful bioinformatic prediction of alternative initiation sites utilized in protein translation. The ability to define aTIS through the described bioinformatic analyses can be of high importance for genomic analyses to provide full predictions of translated mammalian and human gene products required for cellular functions in health and disease.

    Assessing the Gene Content of the Megagenome: Sugar Pine (Pinus lambertiana).

    Sugar pine (Pinus lambertiana Douglas) is within the subgenus Strobus with an estimated genome size of 31 Gbp. Transcriptomic resources are of particular interest in conifers due to the challenges that their megagenomes present for gene identification. In this study, we present the first comprehensive survey of the P. lambertiana transcriptome through deep sequencing of a variety of tissue types to generate more than 2.5 billion short reads. Third-generation long reads generated through PacBio Iso-Seq have been included for the first time in conifers to combat the challenges associated with de novo transcriptome assembly. A technology comparison is provided here to contribute to the otherwise scarce comparisons of second- and third-generation transcriptome sequencing approaches in plant species. In addition, the transcriptome reference was essential for gene model identification and quality assessment in the parallel project responsible for sequencing and assembly of the entire genome. In this study, the transcriptomic data were also used to address questions surrounding lineage-specific Dicer-like proteins in conifers. These proteins play a role in the control of transposable element proliferation and the related genome expansion in conifers.

    Dual RNA-Seq analysis of the pine-Fusarium circinatum interaction in resistant (Pinus tecunumanii) and susceptible (Pinus patula) hosts

    Fusarium circinatum poses a serious threat to many pine species in both commercial and natural pine forests. Knowledge regarding the molecular basis of pine-F. circinatum host-pathogen interactions could assist efforts to produce more resistant planting stock. This study aimed to identify molecular responses underlying resistance against F. circinatum. A dual RNA-seq approach was used to investigate host and pathogen expression in F. circinatum-challenged Pinus tecunumanii (resistant) and Pinus patula (susceptible) at three and seven days post-inoculation. RNA-seq reads were mapped to combined host-pathogen references for both pine species to identify differentially expressed genes (DEGs). F. circinatum genes expressed during infection showed decreased ergosterol biosynthesis in P. tecunumanii relative to P. patula. For P. tecunumanii, enriched gene ontologies and DEGs indicated roles for auxin-, ethylene-, jasmonate- and salicylate-mediated phytohormone signalling. Correspondingly, key phytohormone signaling components were down-regulated in P. patula. Key F. circinatum ergosterol biosynthesis genes were expressed at lower levels during infection of the resistant relative to the susceptible host. This study further suggests that coordination of phytohormone signaling is required for F. circinatum resistance in P. tecunumanii, while a comparatively delayed response and impaired phytohormone signaling contribute to susceptibility in P. patula.
    Funding: The National Research Foundation (NRF) of South Africa Scarce Skills Doctoral Scholarship Programme (Grant ID: 97892), the NRF Bioinformatics and Functional Genomics Programme (Grant IDs: 86936, 97911) and a strategic grant from the Department of Science and Technology (DST) for the Tree Genomics Platform at the University of Pretoria. Further support was provided by Sappi, Mondi, York Timbers and Hans Merensky Foundation through the Forest Molecular Genetics (FMG) Programme with co-funding from the Technology and Human Resources for Industry Programme (THRIP, Grant ID: 96413).
    http://www.mdpi.com/journal/microorganisms
    Biochemistry; Forestry and Agricultural Biotechnology Institute (FABI); Genetics; Microbiology and Plant Patholog

    Combined de novo and genome guided assembly and annotation of the Pinus patula juvenile shoot transcriptome

    BACKGROUND : Pines are the most important tree species to the international forestry industry, covering 42 % of the global industrial forest plantation area. One of the most pressing threats to cultivation of some pine species is the pitch canker fungus, Fusarium circinatum, which can have devastating effects in both the field and nursery. Investigation of the Pinus-F. circinatum host-pathogen interaction is crucial for development of effective disease management strategies. As with many non-model organisms, investigation of host-pathogen interactions in pine species is hampered by limited genomic resources. This was partially alleviated through release of the 22 Gbp Pinus taeda v1.01 genome sequence (http://pinegenome.org/pinerefseq/) in 2014. Despite the fact that the fragmented state of the genome may hamper comprehensive transcriptome analysis, it is possible to leverage the inherent redundancy resulting from deep RNA sequencing with Illumina short reads to assemble transcripts in the absence of a completed reference sequence. These data can then be integrated with available genomic data to produce a comprehensive transcriptome resource. The aim of this study was to provide a foundation for gene expression analysis of disease response mechanisms in Pinus patula through transcriptome assembly. RESULTS : Eighteen de novo and two reference based assemblies were produced for P. patula shoot tissue. For this purpose three transcriptome assemblers, Trinity, Velvet/OASES and SOAPdenovo-Trans, were used to maximise diversity and completeness of assembled transcripts. Redundancy in the assembly was reduced using the EvidentialGene pipeline. The resulting 52 Mb P. patula v1.0 shoot transcriptome consists of 52 112 unigenes, 60 % of which could be functionally annotated. CONCLUSIONS : The assembled transcriptome will serve as a major genomic resource for future investigation of P. patula and represents the largest gene catalogue produced to date for this species. 
Furthermore, this assembly can help detect gene-based genetic markers for P. patula and the comparative assembly workflow could be applied to generate similar resources for other non-model species.
    Additional file 1: Table S1. EvidentialGene tr2aacds pipeline output summary.
    Additional file 2: Table S2. Assembly statistics for EvidentialGene tr2aacds pipeline merged assembly compared to average statistics for each assembler.
    Additional file 3: Table S3. Predicted species distribution for non-pine origin sequences removed from the Pinus patula v1.0 transcriptome.
    Additional file 4: Figure S1. Molecular function gene ontology distribution for the Pinus patula v1.0 transcriptome.
    Additional file 5: Table S4. Tribe-MCL gene families and annotations for all 15 species used.
    Additional file 6: Table S5. Conditional reciprocal best BLAST alignment results between full-length Sanger sequenced Pinus taeda cDNA and representative Pinus patula transcripts for each cDNA.
    Additional file 7: Figure S2. Summary statistics for alignment of Pinus taeda complete CDS sequences to assembled Pinus patula transcripts. Pita = P. taeda. The x-axis represents the query P. taeda cDNA sequence. The solid y-axis (left) illustrates: cDNA query sequence length (pink circle), P. patula subject sequence length (blue square), conditional reciprocal best BLAST alignment length (gold triangle). The dashed y-axis (right) depicts the: percentage identity between sequences (black line), percentage coverage of the P. taeda cDNA by the corresponding P. patula transcript (green cross) and vice versa (purple plus).
    Additional file 8: Table S6. EBSeq differential expression analysis results comparing expression between inoculated and mock-inoculated data.
    Additional file 9: Table S7. Summarized list of differentially expressed genes between inoculated and mock-inoculated data with annotations.
    Funding: Forestry South Africa (for seed funding), the Genomics Research Institute (GRI) and the National Research Foundation’s (NRF) Bioinformatics and Functional Genomics Programme (NBFG, UID:71255) as well as Innovation, Thuthuka and THRIP grants (Grant numbers: 84951, 86936, 87912).
    http://www.biomedcentral.com/bmcgenomics

    The Douglas-Fir Genome Sequence Reveals Specialization of the Photosynthetic Apparatus in Pinaceae.

    A reference genome sequence for Pseudotsuga menziesii var. menziesii (Mirb.) Franco (Coastal Douglas-fir) is reported, thus providing a reference sequence for a third genus of the family Pinaceae. The contiguity and quality of the genome assembly far exceed those of other conifer reference genome sequences (contig N50 = 44,136 bp and scaffold N50 = 340,704 bp). Incremental improvements in sequencing and assembly technologies are in part responsible for the higher quality reference genome, but it may also be due to a slightly lower exact repeat content in Douglas-fir vs. pine and spruce. Comparative genome annotation with angiosperm species reveals gene-family expansion and contraction in Douglas-fir and other conifers which may account for some of the major morphological and physiological differences between the two major plant groups. Notable differences in the size of the NDH-complex gene family and genes underlying the functional basis of shade tolerance/intolerance were observed. This reference genome sequence not only provides an important resource for Douglas-fir breeders and geneticists but also sheds additional light on the evolutionary processes that have led to the divergence of modern angiosperms from the more ancient gymnosperms.

    Gene Frequency Shift in Relict Abies pinsapo Forests Associated with Drought-Induced Mortality: Preliminary Evidence of Local-Scale Divergent Selection

    Current climate change constitutes a challenge for the survival of several drought-sensitive forests. The study of the genetic basis of adaptation offers a suitable way to understand how tree species may respond to future climatic conditions, as well as to design suitable conservation and management strategies. Here, we focus on selected genetic signatures of the drought-sensitive relict fir, Abies pinsapo Boiss. Field sampling of 156 individuals was performed in two elevation ecotones, characterized by widespread A. pinsapo decline and mortality. The DNA from dead trees was investigated and compared to living individuals, accounting for different ages and elevations. We studied the genes gated outwardly-rectifying K+ (GORK) channel and Plasma membrane Intrinsic Protein (PIP1) aquaporin, previously related to drought response in plant model species, to test whether drought was the main abiotic factor driving the decline of A. pinsapo forests. A combination of linear regression and factor models was used to test these selection signatures, as well as a fixation index (Fst), used here to analyze the genetic structure. The results were consistent among these approaches, supporting a statistically significant association of the GORK gene with survival in one of the A. pinsapo populations. These results provide preliminary evidence for the potential role of the GORK gene in the resilience to drought of A. pinsapo.